Like other branches of philosophy, epistemology faces a methodological challenge: to make its procedures more robust, so that errors are more easily corrected. The problem is not any general unsoundness in its basic methods, but difficulty in recovering from errors once made. For example, thought experimentation is a legitimate and ordinary way of learning what would have been in counterfactual possibilities, as I have argued elsewhere (Williamson, 2007). Nevertheless, it is fallible, just as sense perception, memory and reasoning are fallible. Although the communal nature of epistemology means that merely idiosyncratic mistakes in designing and conducting a thought experiment tend to be quickly recognized, that is not the only possible kind of error. A particular thought experiment may exploit a glitch in human cognitive psychology to which we are all prone. For instance, the set-up may prompt us unconsciously to apply a common heuristic in conditions beyond its range of reliability. How can we guard against the danger of wrongly dismissing a correct theory as refuted because it conflicts with a heuristic in such a case? The answer is not to stop using all cognitive faculties which can lead us astray, for that would leave us with nothing. We need, not ways of never making mistakes, but ways of recovering from our mistakes once made. We can do better by using more than one method. When diverse methods lead us to the same conclusion, we have a more robust basis for endorsing it. When they lead to conflicting conclusions, we have a warning sign that something has gone wrong. Each method acts as a potential corrective to the others. For the method of thought experimentation, an appropriate corrective is the method of formal model building—and vice versa (Williamson, 2017). Thus, formal models provide independent confirmation for the morals of Gettier thought experiments (Williamson, 2013). The issue of robustness also arises within the model-building approach. 
Some features of a model may result from arbitrary or at least unforced choices; a better model may avoid them. We can have more confidence in features which are somehow persistent: more specifically, features which provably follow from appropriate general constraints on models. This paper exemplifies that approach, applying it to counter-models to the controversial ‘KK’ principle that whenever you know p, you also know that you know p. It is to be hoped that a similar approach can also be applied to other questions in epistemology. The counter-models to KK are inspired by cases of the following sort. Notoriously, perceptual indiscriminability is non-transitive. Imagine a long line of people, arranged in height from the tallest to the shortest, standing some distance away. They form a sorites-like series. For all you can see of any two successive members, they are exactly the same height; thus, they are visually indiscriminable in height for you (in those circumstances). Yet, the first and the last members are not visually indiscriminable in height for you: you can easily see that the former is much taller than the latter. Thus, on pain of contradiction, for some three people A, B and C in the series, A is visually indiscriminable in height from B, and B is visually indiscriminable in height from C, but A is visually discriminable in height from C. On the intended interpretation of standard possible world models of epistemic logic, a world x is accessible from a world w just in case, for all one knows in w, one is in x—in other words, whatever one knows in w is true in x. Thus, accessibility is understood as a kind of indiscriminability. Since knowledge is factive, whatever one knows in w is true in w, so accessibility is reflexive: no world can be discriminated from itself. One counts as knowing a proposition in w just in case the proposition is true in every world accessible from w. 
As is well known, the transitivity of accessibility corresponds to the validity of the KK principle. An obvious thought is therefore to use instances of the non-transitivity of perceptual indiscriminability to construct natural, somewhat realistic models of epistemic logic which falsify the KK principle, because their accessibility relation is non-transitive. The task is not trivial, since non-transitive indiscriminability between objects of perception has to be leveraged into non-transitive indiscriminability between worlds, without sacrificing too much naturalness or realism. Nevertheless, it can be done (Williamson, 1992, 2000). One bonus of working with indiscriminability is that the counterexamples also work against a more defensible weakening of KK which merely requires that whenever you know p, you are at least in a position to know that you know p. For indiscriminability is not just incompatible with knowing the relevant difference in the relevant way; it is also incompatible with being in a position to know the difference in that way. Of course, standard epistemic logic is widely regarded as quite unrealistic. Specifically, its models validate an unrestricted principle of multi-premise closure for knowledge: whatever truths you know, you also know whatever truths follow from them (in the epistemic logic), with no qualifying condition to the effect that you have carried out the deduction, or anything like that. Such logical omniscience seems way beyond the computational powers of even the cleverest mortals. One response is to try to gloss ‘know’ in some way that automatically validates unrestricted multi-premise closure, though that risks changing the meaning of ‘KK’ and leaving the original principle intact. Another response is to isolate just the required bits of the model and present them as an informal description of an ordinary scenario, without commitment to an unrestricted closure principle. 
A third attitude is just to admit that the model characterizes a drastically idealized agent, but to argue that if even such an idealized agent is susceptible to failures of KK, we can hardly hope to do better. Satisfying KK is not a booby prize for those who are bad at logic. Recently, there has been some pushback in defence of KK, with its epicentre at MIT (Greco, 2014; Stalnaker, 2015; Das & Salow, 2018; Goodman & Salow, 2018; Dorst, 2019, but also McHugh, 2010). Notably, much of this movement has developed within an epistemological approach not unsympathetic in spirit to that of the original critique of KK: a broadly externalist, reliabilist conception of knowledge, a willingness to treat knowledge on its own terms rather than reduce it to some sort of belief with privileges, and an openness to applying the techniques of formal epistemology. A generic feature of the methodology of formal model building is that when one argues for the possibility of a phenomenon (such as KK failure) on the grounds that simple models of some type of situation predict it, someone can counter by building more complex models of such situations which do not predict the phenomenon. That has in effect happened for KK failure. Of course, if the extra complications look gerrymandered just to avoid the prediction, the counter is open to the objection that it is ad hoc, and not scientifically respectable: that has to be judged case by case. In any case, it would be nice to generalize the original argument for the possibility of the phenomenon, by showing the prediction to be robust because it follows from quite general features of the model, which should be preserved even as further dimensions of complexity are added: it is no mere artefact of the extreme simplicity of the original model. The next three sections do that for one type of counter-model to KK. The final section uses that result to reflect on recent defences of KK. 
Clock models were originally proposed to argue for a phenomenon much more drastic than KK failure: the possibility of knowing p while it is almost certain on one's own present evidence (or knowledge) that one does not know p (Williamson, 2011, 2014). The failure of KK is in effect a corollary of that phenomenon. However, working with a probability distribution requires specifying much more structure over the model, in a way which in this case is hard to generalize in an adequately motivated way to the desired extent, especially for models with infinitely many worlds. We therefore aim at a more modest target, simplifying matters by considering the original model just as a proposed counter-model to KK. Imagine looking from some distance at an unmarked circular clock face with just one rotating hand. That is your only source of knowledge of the hand's position (for example, 3 o'clock). Your question is where the hand is pointing (what time it is pointing at), not whether the clock is accurate. We treat the non-epistemic possibilities as simply positions or points on a circle. By looking, you gain some knowledge of the hand's position—you can rule out some of those possibilities—but you do not gain full knowledge—you cannot rule out all of them except the actual one. The original model was discrete, with finitely many equally spaced positions for the hand to be at, in order to avoid tractable but messy complications with probability distributions over an infinite probability space. For brevity, we simply write ‘positions’ for the positions available for the hand to be at. Since we have put probability aside here, infinite models raise no special problem. We can have a continuous model, with positions on a segment of the circle ordered like the real numbers in a bounded interval, which is geometrically more natural. 
If we want to denote each position with a finite string of symbols from a finite alphabet, we can make the set Θ of positions countable, with positions on a segment ordered like the rational numbers in a bounded interval. The choice between integer-valued, real-valued and rational-valued positions makes little difference to the argument below (though it would need adjustment if we allowed infinitesimal distances). Irrespective of that choice, we describe Θ as a circle. What matters is that the positions are too close together for the observer to see exactly which one the hand is at, and that they are evenly spaced, so that Θ has rotational and reflective symmetry, like a circle. Thus, for any positions θ₁ and θ₂, some structure-preserving mappings of Θ onto itself map θ₁ to θ₂. The observer's powers of visual discrimination are assumed to show the same symmetries. They are just as good around θ₁ as they are around θ₂, and just as good clockwise as anticlockwise. Whether the observer can discriminate θ₁ from θ₂ depends only on the angle they subtend at the centre of the circle. In this simple case, the indiscriminability relation between positions constitutes the accessibility relation between worlds, for purposes of epistemic logic. Strictly speaking, in the terminology of epistemic and modal logic, this set-up is a frame rather than a model, because it specifies a set of worlds and an accessibility relation over that set, but no specific propositions to interpret the atomic formulas of a formal language for epistemic logic. No such formal language was needed for the purposes at hand. To avoid confusion, we respect the distinction between frames and models in what follows. A result in Section 4 is best articulated with reference to a formal language. The original frame is very simple. The worlds are simply the positions themselves. 
The worlds accessible from a given world, in effect the positions indiscriminable from the given position, form an interval of constant length centred on the given position. For ease of visualization, we may assume that the interval occupies a comparatively small proportion of the circle. Start from a position θ₁; let θ₂ be a position within the interval of indiscriminability around θ₁ but near its edge going clockwise; and let θ₃ be a position within the interval of indiscriminability around θ₂ but near its edge going clockwise. Then, θ₃ is not within the interval of indiscriminability around θ₁. In other words, θ₂ is accessible from θ₁, and θ₃ is accessible from θ₂, but θ₃ is not accessible from θ₁. Thus, accessibility is non-transitive, so the KK principle fails in the frame. To be more specific, let X be the proposition true at just the worlds accessible from θ₁; thus, X is the strongest truth one knows at θ₁: at θ₁, one knows X, and X entails everything else one knows. Then, as an observer in θ₁, one knows X but does not know that one knows X. Although the original frame is very natural, it involves some drastic simplifications, on top of those already built into the standard semantic framework for epistemic logic. By identifying worlds with positions, it restricts the observer's ignorance of epistemic matters to ignorance induced by ignorance of non-epistemic matters. In the frame, no two epistemic possibilities alike in the position of the hand differ in what the observer knows. For example, the observer in effect knows the exact length of the interval of indiscriminability, since it is constant from one world to another. By contrast, in real life, observers are typically ignorant of some such structural features of their own cognitive powers. Another special feature of the original frame is that its accessibility relation is symmetric: whenever θ₁ is indiscriminable from θ₂, θ₂ is indiscriminable from θ₁. 
Yet, non-symmetric accessibility relations are needed in epistemology, for example, to model a sceptical scenario without letting scepticism infect the corresponding non-sceptical scenario. In the bad case, for all one knows one is in the good case, but in the good case one knows things incompatible with being in the bad case. Thus, the worry arises that introducing extra structure into the frame to make it more realistic in such respects may somehow disrupt the counterexamples to KK. To counter that worry, we need to generalize. We do so by considering all frames which generalize simple clock frames in the following way. We still have a set of worlds W, but we no longer equate it with the circle Θ. However, we can map W onto Θ: for each world w, [w] is the position of the hand in w. As usual, propositions are equated with subsets of W. For any position θ, Pθ is the proposition that the hand is at θ; in other words, Pθ = {w ∈ W: [w] = θ}. The frame encodes knowledge in the usual way, with a dedicated accessibility relation R. For any proposition X, KX is the proposition that (as the observer) one knows X; as usual, KX is true if and only if X is true in all accessible worlds, so KX = {w ∈ W: ∀x(if Rwx, x ∈ X)}. But R can no longer be defined simply by angular distance, since that distance may not determine all the differences between worlds. When [w] = [x] but w ≠ x, whether x is accessible from w may depend on epistemic differences between w and x, or even on non-epistemic differences, for we allow a frame to have additional structure beyond W and R, although W and R are our ultimate concern. We do not require R to be symmetric. We do require that each rotation r of the circle corresponds to an automorphism r* of the frame which moves the hand accordingly, for every world w:

(1a) [r*(w)] = r([w])

In other words, the hand's position in the result of applying the automorphism to a world is the result of applying the corresponding rotation to the hand's position in the original world. 
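The set-theoretic definitions of Pθ and K can be made concrete with a minimal computational sketch. It assumes the simplest case, in which worlds just are positions on a discrete circle; the number of positions `N`, the margin `C` and all function names are illustrative assumptions, not part of the original model.

```python
# Illustrative sketch only: a discrete simple clock frame.
N = 60  # assumed number of equally spaced positions
C = 5   # assumed margin of indiscriminability, in steps

W = frozenset(range(N))  # worlds are just positions here


def steps_apart(a, b):
    """Shortest distance around the circle between positions a and b."""
    d = abs(a - b) % N
    return min(d, N - d)


def R(w, x):
    """Accessibility: x is indiscriminable from w."""
    return steps_apart(w, x) <= C


def P(theta):
    """P_theta = {w in W : [w] = theta}; here [w] is w itself."""
    return frozenset(w for w in W if w == theta)


def K(X):
    """KX = {w in W : for all x, if Rwx then x in X}."""
    return frozenset(w for w in W if all(x in X for x in W if R(w, x)))


# X is the strongest proposition known at position 0.
X = frozenset(x for x in W if R(0, x))
KX = K(X)
KKX = K(KX)

print(R(0, 5), R(5, 10), R(0, 10))  # True True False: R is non-transitive
print(0 in KX, 0 in KKX)            # True False: KK fails at position 0
```

On these assumptions, the observer at position 0 knows X without knowing that they know X, as the informal argument for the original frame predicts.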
For present purposes, rotating clockwise through 90° counts as the same rotation as rotating anticlockwise through 270°, rotating through 360° counts as the same rotation as rotating through 0°, and so on, because they have the same effect. Rotations are functions, not temporal processes. Rotations of the circle induce rotational automorphisms of the frame <W, R> in a natural way, so that the rotational automorphisms inherit much, though not all, of the structure of the rotations themselves (Section 4 will explain in detail one general way for this to happen). In particular, this concerns their algebraic structure. The rotations of the circle form a group in the mathematical sense, under the operation of functional composition. There is an identity rotation 1 which leaves everything where it was: 1(θ) = θ for every position θ. Every rotation r has a two-sided inverse rotation r⁻¹ which cancels out r: r⁻¹(r(θ)) = r(r⁻¹(θ)) = θ. Rotations can be composed: applying first r₁ and then r₂ is equivalent to applying the rotation r₂r₁: r₂r₁(θ) = r₂(r₁(θ)). Thus 1r = r1 = r and rr⁻¹ = r⁻¹r = 1. The group of rotations of the circle induces a corresponding group of automorphisms of the frame. We call such automorphisms r* of the frame rotational automorphisms. The frame may have non-rotational automorphisms too, but we are less interested in them. The automorphisms of any structure automatically form a group, under composition (which is always associative in the mathematical sense), and in this case, the rotational automorphisms form a subgroup of that group. Its group operations are induced by those of the group of rotations. For any rotations r₁ and r₂, the rotational automorphism (r₂r₁)* is just the composition of r₁* and r₂*: (r₂r₁)*(w) = r₂*(r₁*(w)). Consequently, the rotational automorphism 1* is the identity automorphism, for 1*r* = (1r)* = r* for any rotation r. Similarly, (r⁻¹)* is the two-sided inverse of r*, for (r⁻¹)*r* = (r⁻¹r)* = 1* = (rr⁻¹)* = r*(r⁻¹)*. 
Since the rotational automorphisms form a subgroup, we can define an equivalence relation E over W by setting Ewx just in case for some rotation r, r*(w) = x. E is reflexive because the identity is a rotational automorphism; it is symmetric because rotational automorphisms are closed under inverses; it is transitive because they are closed under composition. Thus, E partitions W into mutually exclusive, jointly exhaustive subsets, called orbits. Studying epistemic phenomena on a fixed orbit will turn out fruitful. A distinctive feature of the circle is that for any positions η and θ, exactly one rotation r is such that r(η) = θ. The epistemic frame need not inherit that feature. Typically, some worlds differ from each other structurally with respect to R so that no automorphism of the frame maps one to the other. Then, W divides into more than one orbit. However, the frame does inherit the feature of uniqueness from the circle: at most one rotational automorphism maps a world w to a world x. For if r₁*(w) = r₂*(w), then [r₁*(w)] = [r₂*(w)], so by (1a) r₁([w]) = r₂([w]), so r₁ = r₂ by uniqueness for the circle. As a corollary, the * function is one-one: whenever r₁* = r₂*, r₁ = r₂, so distinct rotations never induce the same rotational automorphism. This also means that for any worlds w and x in the same orbit, there is a unique rotation r of the circle such that r*(w) = x. We can therefore use the size of the angle (in degrees) subtended by positions θ and r(θ) at the centre of the circle (which is independent of θ) as a natural numerical measure of the ‘distance’ |w, x| between the worlds w and x. Of course, this measure is defined only for worlds in the same orbit. Within an orbit, accessibility is required to be monotonic in distance, for all worlds w, x and y in the same orbit:

(1b) If Rwx and |w, y| ≤ |w, x|, then Rwy

For when the worlds differ only by rotation, and y is at least as close to w as x is, then x is indiscriminable from w only if y is indiscriminable from w. In that way, discriminability of worlds from a given world increases with distance from that world. 
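The group structure of the rotations, the uniqueness of the rotation between two positions, and the resulting distance measure can be checked on a small discrete circle. The following is a rough sketch under simplifying assumptions: worlds are identified with positions (so r* = r and there is a single orbit), and `N`, `C_DEGREES` and the function names are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: rotations of a discrete circle with N positions.
N = 12          # assumed number of positions
C_DEGREES = 60  # assumed margin of indiscriminability, in degrees


def rotation(k):
    """Rotation through k steps of 360/N degrees, as a function on positions."""
    return lambda theta: (theta + k) % N


def compose(r2, r1):
    """The rotation r2r1: apply r1 first, then r2."""
    return lambda theta: r2(r1(theta))


def same_rotation(r, s):
    """Rotations are functions, so they are identical when they agree everywhere."""
    return all(r(theta) == s(theta) for theta in range(N))


def unique_rotation(eta, theta):
    """Return the one k in range(N) whose rotation maps eta to theta."""
    ks = [k for k in range(N) if rotation(k)(eta) == theta]
    assert len(ks) == 1  # uniqueness for the circle
    return ks[0]


def distance(w, x):
    """Angle in degrees, between 0 and 180, separating w and x."""
    k = unique_rotation(w, x)
    return min(k, N - k) * 360 / N


def R(w, x):
    """Accessibility defined purely by angular distance."""
    return distance(w, x) <= C_DEGREES


# 90 degrees clockwise followed by 270 degrees clockwise is the identity rotation.
print(same_rotation(compose(rotation(9), rotation(3)), rotation(0)))  # True
print(distance(0, 3), distance(0, 9))  # 90.0 90.0

# Monotonicity in the sense of (1b): closer worlds are accessible if farther ones are.
monotone = all(
    R(w, y)
    for w in range(N) for x in range(N) for y in range(N)
    if R(w, x) and distance(w, y) <= distance(w, x)
)
print(monotone)  # True
```

In richer frames with several orbits, the distance function would be defined only within an orbit, which this single-orbit sketch does not model.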
A further constraint is invariance, for every rotation r and all worlds w and x:

(1c) Rwx if and only if Rr*(w)r*(x)

In other words, the accessibility relation is invariant under rotational automorphisms: they preserve the epistemic structure of the frame. This is just part of what is meant by saying that r* is an automorphism of the frame. If the frame has more structure on W, beyond R, we should add analogues of (1c) for the additional properties, relations or functions. Again, these invariance principles only mean that the frame does not introduce arbitrary deviations from the rotational symmetry of the underlying set-up. Crucially, the postulated symmetries are of the frame as a whole, not within each world in the frame. The most obvious reason for this is that in each world w, the hand is at a specific position [w], which breaks the intra-world symmetry between [w] and any other position r([w]). Rotational symmetry is inter-world symmetry: in another world r*(w), the hand is at r([w]), as in (1a). More subtly, a psychological bias may break the intra-world symmetry: for example, its effect may be that one is much better at discriminating around 6 o'clock and 12 o'clock than around 3 o'clock and 9 o'clock. That is quite consistent with inter-world rotational symmetry at the level of the frame, which merely implies that the frame contains some other world in which one has a psychological bias whose effect is that one is correspondingly better at discriminating around 3 o'clock and 9 o'clock than around 6 o'clock and 12 o'clock. We do not assume that r*(w) is causally the closest world to w in which the position of the hand is r([w]). In the case just considered, the psychological bias may be quite deeply rooted, and hard to ‘rotate’. 
Thus, if w is the world in which the hand is at 6 o'clock and one's bias favours 6 o'clock and 12 o'clock, causally the closest world in which the hand is at 9 o'clock may be one in which one's bias still favours 6 o'clock and 12 o'clock, not one in which it has been rotated to favour 3 o'clock and 9 o'clock, whereas the world r*(w) is of the latter kind. Something is wrong with any view of knowledge which cannot handle such naturally symmetric frames. The remaining two constraints are stated with respect to a fixed world z, which we informally envisage as a typical instance of the more ‘normal’ or ‘realistic’ worlds in the frame, while allowing that the frame may also contain less realistic, more abnormal worlds, perhaps as merely epistemic possibilities, or something even more distant than that. Thus, the constraints on z are not constraints on all worlds in the frame. The first constraint on z is that, for some position θ:

(1d) z ∈ K(W−Pθ)

In particular, since K is factive, z ∈ W−Pθ, so [z] ≠ θ. By (1d), in z one can rule out some candidate positions for the hand; looking at the clock was not a complete waste of time. The natural alternative to (1d) would be some form of scepticism about the external world. The second constraint on z is that, for some rotation s ≠ 1:

(1e) Rzs*(z)

In other words, as the observer in z, one cannot discriminate the world one is in from the result of slightly rotating it. By condition (1c), s*(z) is a world with the same epistemic structure as z: it differs from z only in ways consequent on a slight difference in the position of the hand. This is consistent with there being other worlds in the frame where the hand has the same position as in s*(z), but the structure of the observer's knowledge differs markedly from that in z: the observer in z may be able to rule out being in one of the latter worlds, but cannot rule out being in s*(z). Notably, (1e) allows the observer's powers of discrimination to be greater in some worlds than in others, but then, the former cannot be rotated to the latter. 
For reasons already explained, s*(z) may not be causally the closest world to z in which the position of the hand is s([z]); the latter world may be one in which some psychological bias in z has not been rotated by the analogue of s. Nevertheless, epistemically, for a small enough rotation s, the observer in z cannot discriminate their world from one just like it but correspondingly slightly rotated by s*. The natural alternative to (1e) would be an implausible kind of omniscience about the relevant aspect of the world, for if (1e) fails, the observer in z (as well as being logically omniscient) can in effect discriminate their position in the world from an arbitrarily close and structurally exactly similar position, differing only in ways correlated with an arbitrarily slight difference in the real and apparent positions of the hand. Neither normal vision nor normal introspection has so perfect a power of discrimination. Given (1a-1e), we can prove that R is non-transitive. For let z, θ and s verify (1d-1e). Then, for some small enough rotation r and some natural number n ≥ 1, θ = rⁿ([z]) (where rⁿ is n iterations of r) and |z, s*(z)| ≥ |z, r*(z)| (we assume the Archimedean property of the standard real numbers: the argument would not work in its present form if |z, s*(z)| were infinitesimal). Then, Rzr*(z) by (1e) and (1b). So, for each m < n, by (1c) R(r*)ᵐ(z)(r*)ᵐ(r*(z)), i.e. R(r*)ᵐ(z)(r*)ᵐ⁺¹(z). Suppose that R is transitive. Then, R(r*)⁰(z)(r*)ⁿ(z), i.e. Rz(r*)ⁿ(z). Thus, by (1d) and the definition of K, (r*)ⁿ(z) ∈ W−Pθ. Hence, by definition of Pθ, [(r*)ⁿ(z)] ≠ θ. But [(r*)ⁿ(z)] = rⁿ([z]) by n applications of (1a), and r was chosen such that θ = rⁿ([z]), so [(r*)ⁿ(z)] = θ. This is a contradiction. Thus, R is non-transitive. QED. By similar reasoning, without assuming that R is transitive, we can show that n ≥ 1 and Kⁿ(W−Pθ) is false at z (where Kⁿ is n iterations of K). But K(W−Pθ) is true at z by (1d). Thus, for some m ≥ 1, Kᵐ(W−Pθ) is true at z while Kᵐ⁺¹(W−Pθ) is false at z. 
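The existence of such an m can be illustrated by iterating K on a small frame until knowledge gives out. This is only a sketch for the simple discrete clock frame, where worlds are positions; the values of `N` and `C`, the choice of z and θ, and the function names are assumptions made for the example.

```python
# Illustrative sketch only: find m with K^m(W - P_theta) true and K^(m+1) false at z.
N = 60  # assumed number of equally spaced positions
C = 5   # assumed margin of indiscriminability, in steps

W = frozenset(range(N))  # worlds are just positions here


def steps_apart(a, b):
    """Shortest distance around the circle between positions a and b."""
    d = abs(a - b) % N
    return min(d, N - d)


def R(w, x):
    """Accessibility: x is indiscriminable from w."""
    return steps_apart(w, x) <= C


def knows(X):
    """KX = {w in W : for all x, if Rwx then x in X}."""
    return frozenset(w for w in W if all(x in X for x in W if R(w, x)))


def first_kk_failure(X, z):
    """Return m >= 1 with z in K^m X but z not in K^(m+1) X, or None if there is none."""
    current = knows(X)  # K^1 X
    if z not in current:
        return None
    m = 1
    while True:
        nxt = knows(current)  # K^(m+1) X
        if z not in nxt:
            return m
        if nxt == current:
            return None  # fixed point: knowledge iterates forever at z
        current, m = nxt, m + 1


z = 0           # the 'realistic' world
theta = N // 2  # the position diametrically opposite z
m = first_kk_failure(W - {theta}, z)
print(m)  # 5
```

Since R is reflexive, each iteration of K can only shrink the proposition, so the loop terminates on any finite frame. With these values, each iteration excludes a further band of five positions on either side of θ, and the sixth iteration excludes z itself.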
In other words, K(Kᵐ⁻¹(W−Pθ)) is true at z while KK(Kᵐ⁻¹(W−Pθ)) is false at z, so in the frame an instance of the KK principle fails at the realistic world z itself, not just at some ‘far-out’ world. The result shows that arguing against KK from the original frame did not crucially depend on its very simple structure. A version of the argument goes through for a wide range of much richer and more realistic frames. What it takes is just rotational symmetry combined with the idea that, in the envisaged circumstances, perception provides some knowledge but not complete knowledge of the hand's position (on the current orbit). Although rotational symmetry is of course a simplification, rejecting it looks like an ad hoc move of the most scientifically unproductive kind.

To see how much the constraints (1a-1e) allow, we can construct a wide range of epistemic frames satisfying them all. Indeed, this section establishes a more precise result: if a formula in a standard language of single-agent epistemic logic is invalid on some reflexive frame, it is invalid on some reflexive frame satisfying (1a-1e). In that sense, the constraints are compatible with a wide range of basic epistemic phenomena. Here, K as a symbol of the formal language operates on sentences, while K as defined in Section 3 operates on propositions. A formula A is valid over a frame <W, R> if and only if for every interpretation I over <W, R>, I(A) = W; in other words, A is true at every world on every interpretation. Thus, <W, R> invalidates A if and only if I(A) ≠ W for some interpretation I over <W, R>. The aim is to show that if a formula is invalid on some epistemic frame, it is invalid on some epistemic frame satisfying the constraints (1a-1e). The first step is to provide a way of combining a simple unmarked clock frame <Θ, Δ> with an arbitrary epistemic frame <W, R> to make a new epistemic frame <W × Θ, R × Δ>. Informally, a new world in W × Θ is simply a world in W together with a position for the clock's hand. 
One new world is accessible from another just in case both component worlds of the former are accessible in the relevant sense from the respective component worlds of the latter: to discriminate between new worlds, the observer must discriminate either between their first components or between their second components. About <W, R>, we assume only that R is reflexive, as above. We first describe the simple clock frame <Θ, Δ> more carefully and check that it satisfies (1a-1e). For simplicity, we take Θ to be the set of points on a circle in standard Euclidean space, though it can easily be adjusted to allow Θ to be countable or finite. Since the worlds in Θ are just positions, we set [θ] = θ and r* = r. Thus, (1a) holds trivially in <Θ, Δ>. We measure the distance |η, θ| between positions η and θ by the relevant angle in degrees; thus, 0 ≤ |η, θ| ≤ 180. We fix a constant c such that 0 < c < 180 and define the indiscriminability relation Δ to hold between η and θ just in case |η, θ| ≤ c. Thus, Δ is reflexive, holds between some but not all pairs of positions on the circle and obviously satisfies the monotonicity condition (1b). It also satisfies the invariance condition (1c) since |r(η), r(θ)| = |η, θ| for any rotation r. Now let z be any position and θ the position diametrically opposite z, so |z, θ| = 180, so Δzθ fails (since c < 180). Moreover, Pθ = {θ}, so z ∈ K(W−Pθ), so <Θ, Δ> satisfies (1d). Moreover, if s is a rotation through c degrees, s ≠ 1 (since 0 < c) and |z, s(z)| = c, so Δzs(z), so (1e) holds too. Thus, the simple clock frame <Θ, Δ> is reflexive and satisfies all the constraints (1a-1e). Given these definitions, we can easily check that <W × Θ, R × Δ> satisfies constraints (1a-1c), because <Θ, Δ> already does. Now let <w, z> be any new world. We first check (1d) for <W × Θ, R × Δ>. In checking it for <Θ, Δ>, we already established that Δzθ fails for some position θ. 
Thus, (R × Δ)<w, z><x, θ> also fails for any x in W, so W−Pθ is true at any new world to which <w, z> has R × Δ, so K(W−Pθ) is true at <w, z>, as (1d) requires. Now, we check (1e) for <W × Θ, R × Δ>. In checking it for <Θ, Δ>, we showed Δzs(z) for some rotation s ≠ 1; since R is reflexive by hypothesis, (R × Δ)<w, z><w, s(z)>, as (1e) requires. Moreover, since both R and Δ are reflexive, so is R × Δ. Thus, the new frame <W × Θ, R × Δ> is reflexive and satisfies the constraints (1a-1e). The next task is to show that if a formula is invalid on <W, R>, it is also invalid on <W × Θ, R × Δ>. By contrast, the converse is not generally true. For example, the formula Kp → KKp (where p is an atomic formula), which expresses the KK principle, is invalid on <W × Θ, R × Δ>, by what has already been proved. But we can choose R to be a transitive relation, making the formula valid on <W, R>. In the third equation, note that strictly speaking ‘K’ expresses an operation on subsets of W on the left-hand side but an operation on subsets of W × Θ on the right-hand side; in practice, context always resolves the ambiguity. The reader is spared the routine checks that the three equations hold. Thus, Itr is an interpretation of the language over <W × Θ, R × Δ>. We can now complete the proof of the desired result. Suppose that an epistemic frame <W, R> invalidates a formula A. T